    Learning Determinantal Point Processes

    Determinantal point processes (DPPs), which arise in random matrix theory and quantum physics, are natural models for subset selection problems where diversity is preferred. Among many remarkable properties, DPPs offer tractable algorithms for exact inference, including computing marginal probabilities and sampling; however, an important open question has been how to learn a DPP from labeled training data. In this paper we propose a natural feature-based parameterization of conditional DPPs, and show how it leads to a convex and efficient learning formulation. We analyze the relationship between our model and binary Markov random fields with repulsive potentials, which are qualitatively similar but computationally intractable. Finally, we apply our approach to the task of extractive summarization, where the goal is to choose a small subset of sentences conveying the most important information from a set of documents. In this task there is a fundamental tradeoff between sentences that are highly relevant to the collection as a whole, and sentences that are diverse and not repetitive. Our parameterization allows us to naturally balance these two characteristics. We evaluate our system on data from the DUC 2003/04 multi-document summarization task, achieving state-of-the-art results.

    'Catastrophic Failure' Theories and Disaster Journalism: Evaluating Media Explanations of the Black Saturday Bushfires

    In recent decades, academic researchers of natural disasters and emergency management have developed a canonical literature on 'catastrophic failure' theories, addressing cases such as disaster responses from US emergency management services (Drabek, 2010; Quarantelli, 1998) and the Three Mile Island nuclear power plant (Perrow, 1999). This article examines six influential theories from this field in an attempt to explore why Victoria's disaster and emergency management response systems failed during Australia's Black Saturday bushfires. How well, if at all, are these theories understood by journalists, disaster and emergency management planners, and policy-makers? Examining the Country Fire Authority's response to the fires, as well as the media's reportage of them, we use the 2009 Black Saturday bushfires as a theory-testing case study of failures in emergency management, preparation and planning. We conclude that journalists can learn important lessons from academics' specialist knowledge about disaster and emergency management responses.

    A quantum Mirković-Vybornov isomorphism

    We present a quantization of an isomorphism of Mirković and Vybornov which relates the intersection of a Slodowy slice and a nilpotent orbit closure in $\mathfrak{gl}_N$ to a slice between spherical Schubert varieties in the affine Grassmannian of $PGL_n$ (with weights encoded by the Jordan types of the nilpotent orbits). A quantization of the former variety is provided by a parabolic W-algebra and of the latter by a truncated shifted Yangian. Building on earlier work of Brundan and Kleshchev, we define an explicit isomorphism between these non-commutative algebras, and show that its classical limit is a variation of the original isomorphism of Mirković and Vybornov. As a corollary, we deduce that the W-algebra is free as a left (or right) module over its Gelfand-Tsetlin subalgebra, as conjectured by Futorny, Molev, and Ovsienko. Comment: v2: 48 pages. Major rewrite following referee comments. Added proof of a conjecture of Futorny, Molev, and Ovsienko that the finite W-algebra is free over its Gelfand-Tsetlin subalgebra.

    Three-Way Joins on MapReduce: An Experimental Study

    We study three-way joins on MapReduce. Joins are very useful in a multitude of applications, from data integration and traversing social networks to mining graphs and automata-based constructions. However, joins are expensive, even for moderate data sets; we need efficient algorithms to perform distributed computation of joins using clusters of many machines. MapReduce has become an increasingly popular distributed computing system and programming paradigm. We consider a state-of-the-art MapReduce multi-way join algorithm by Afrati and Ullman and show when it is appropriate for use on very large data sets. By providing a detailed experimental study, we demonstrate that this algorithm scales much better than what is suggested by the original paper. However, if the join result needs to be summarized or aggregated, as opposed to being only enumerated, then the aggregation step can be integrated into a cascade of two-way joins, making the cascade more efficient than the other algorithm and thus the preferred solution. Comment: 6 pages

    Optimising Selective Sampling for Bootstrapping Named Entity Recognition

    Training a statistical named entity recognition system in a new domain requires costly manual annotation of large quantities of in-domain data. Active learning promises to reduce the annotation cost by selecting only highly informative data points. This paper is concerned with a real active learning experiment to bootstrap a named entity recognition system for a new domain of radio astronomical abstracts. We evaluate several committee-based metrics for quantifying the disagreement between classifiers built using multiple views, and demonstrate that the choice of metric can be optimised in simulation experiments with existing annotated data from different domains. A final evaluation shows that we gained substantial savings compared to a randomly sampled baseline.